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Abstract — In this paper, a method is proposed for on-line 
monitoring of the control updating period in fast-gradient- 
based Model Predictive Control (MPC) schemes. Such schemes 
are currently under intense investigation as a way to accom- 
modate for real-time requirements when dealing with systems 
showing fast dynamics. The method needs cheap computations 
that use the algorithm on-line behavior in order to recover the 
optimal updating period in terms of cost function decrease. 
A simple example of constrained triple integrator is used to 
illustrate the proposed method and to assess its efficiency. 

I. INTRODUCTION 

A great amount of research during the last decade has been 
dedicated to the implementation of MPC schemes to systems 
showing fast dynamic. These are systems having character- 
istic time requiring updating periods that are incompatible 
with the complete solution of the underlying optimization 
problem. 

The solution that emerges with a wide consensus is to 
perform, during each control updating period, only a limited 
number of iterations of some descent method and to apply 
the so obtained sub-optimal solution during the next updating 
period. The iterations are then continued after a potential 
warm start at the next updating period leading hopefully to 
asymptotical recovering the optimal solution. 

In [2], the author showed that this implementation frame- 
work enhances a generic trade-off in terms of the control 
updating period. Indeed, if the latter is too small, the decrease 
in the cost function is not sufficient to compensate for the 
increase that may be due to model uncertainties (including 
those related to the potentially unknown dynamic of the set- 
point to be tracked). On the other hand, For too large control 
updating periods, the lack of fresh information means that 
the optimizer would solve with a high precision a rather out- 
of-date optimization problem. 

In the present paper, the generic framework proposed in 
[2] is adapted to the specific context where the iterations are 
defined by gradient or fast gradient algorithms (see for in- 
stance [3], [4], [8] and the reference therein). More precisely, 
while in [2], attempt to identify the so-called efficiency map 
that characterizes the descent method is considered, here the 
number of iterations (and hence the control updating period) 
is adapted locally based on the estimation of the gradient 
of the settling time under a contraction constraint on the 
dynamic of the cost function. 

This method can be applied as such when pure gradient 
descent method is applied to distribute the computation 
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of the optimal control sequence. In the case where a fast 
gradient iterations are used, it is shown that the restarting 
mechanism proposed by [7] is necessary in order to obtain 
a monotonically decreasing cost function. 

The paper is organized as follows. First, the general 
framework proposed in [2] is recalled in section [Il]in order to 
set the notation and to underline the crucial role played by the 
so called iteration efficiency map involved in a small gain- 
like theorem. An ideal optimization problem is then derived 
in which the decision variable is the control updating period 
(or equivalently, the number of iterations before updating the 
control) while the cost function represents the settling time of 
the MPC closed-loop cost function evolution. In Section [III 
an updating rule for the control updating period is defined to- 
wards the minimization of this cost function and is based on 
the estimation of its gradient. This estimation is done using 
available data regarding the behavior of the algorithm. For 
the proposed updating rule to asymptotically solve the ideal 
optimization problem, a monotonicity property is needed 



for the descent method. Section [IV] recalls the fast gradient 
method and shows how restarting mechanism [7] can be used 
to enhance monotonic decrease of the fast gradient method 
making the proposed updating rule appropriate. Finally, in 
section [V| an illustrative example is given in order to show 
the efficiency of the proposed scheme in monitoring on- 
line the number of iterations of the fast gradient algorithm 
when used in real-time implementation of constrained MPC. 



Finally, section VI concludes the paper and gives hints for 
further investigation. 

II. THE IDEAL CONTROL UPDATING RULE 

A. Some definitions and notation 

In the present paper the framework of parameterized 
NMPC [1] is adopted for the sake of generality since the 
standard piece- wise constant control parameterization can be 
viewed as a particular instantiation of the following more 
general form: 



U pwc (p) := (tf)(|)),...,^(ri)GUC 



pAT-r, 



(1) 



where p £ R np is the vector of degrees of freedom while 
the u^\p) stands for the corresponding constant value of 
the control vector at the future z-th sampling period. 

Using the notation above, the model of the dynamic system 
can be given in the following general form: 



Vj£{l,...,iV} , x(k + j)=X(j,x(k),p) 



(2) 



where X(j, x, p) is the state reached by the model at the j-th 
future sampling instant when it starts at the state x(k) under 
the p.w.c sequence defined by p through ([T]) . 



Since the difference between the model used by the 
controller and the real system plays a crucial role in the 
sequel, the notation: 

is used hereafter to denote the state of the real system after 
j sampling periods, starting from x(k), under the control 
sequence defined by p and when some disturbance sequence 
w occurs. The shorter notation X r (j,x(k),p) is sometimes 
used (explicit mention of w is omitted) as no specific 
discussion is needed regarding the origin of w. 

Let us denote by J(p,x) the value of the cost function 
that is used to define a Model Predictive Controller (MPC). 
The following assumption is considered in the sequel which 
is always possible to meet by adding appropriate constant to 
the cost function 

Assumption 1: There is some J > such that 

V(p,aO, J(p,x)>J>0 (3) 
This assumption is needed since contraction ratios of the 
cost function at successive instants are used in the sequel in 
which the cost function may appear in the denominator. 

Obviously, such a cost function is defined for a given 
control sequence parameter p and for a given current state x 
since these quantities define the state and control trajectories 
of the system's model. The prediction horizon N is implicitly 
included in the expression of J as it is considered to be given 
once for all in the current paper. 

An ideal MPC is defined by u^(p(x)) [see {!])] where 
p(x) is the solution of the following constrained optimization 
problem: 

p(x) := min J(p,x) under C(p,x) < (4) 

where C(p,x) defines the problem constraints. 

B. Distributed-in-time Implementation of MPC 

Nowadays real-time MPC related investigations are jus- 
tified by the fact that the computation of p{x) would be 
impossible during a single sampling period. Rather a lim- 
ited number q of iterations of some descent method are 
performed. This can be denoted by: 

p (Q) =S q (p {0 \x) (5) 

where p^ is the parameter vector that is obtained after q 
iterations of some descent methocf] S starting from some 
initial value p^ . The map is obviously parameterized by x 
since the optimization problem ^ is also parameterized by 
the current state. 

Assumption 2: A single iteration of the subroutine S 
needs a fixed amount of computation time that is denoted 
hereafter by r c . 

Assumption 3: The period r c is used as the basic sam- 
pling period for the definition of the p.w.c control profile 
given by {!]). Namely each value u^\p) is maintained 
constant during r c time units. 

! The gradient and fast gradient iteration are considered in the present 
contribution. 



Therefore, if several iterations q are needed, in a sense 
that will be clearer in the sequel, the control parameter p 
cannot be updated before r u = qr c time units. r u may be 
called the control updating period. However, if the number 
of iterations q varies dynamically (as it is proposed in the 
current contribution) then it is no more possible to define a 
control updating period but only control updating instants t% 
such that: 

t u k+1 =t u k +q(t u k )-T c (6) 

where q(£%) is the number of iterations to be performed 
before the next updated value of the control parameter 
p(tt+i) * s delivered. Meanwhile, the previously computed 
control sequence: 

u^(p(tt)),...,u^\p(tt)) (7) 

is applied during the updating interval In the 

sequel, the following notation is used : 

r fc " = t u k+l -t u k = q{t u k ) ■ t c (8) 

Note that for this scheme to be possible, the following 
constraint is obviously needed: 

q(tl) <N (TV = is the prediction horizon) (9) 

since otherwise, the sequence invoked in ([7]) would not be 
available. 

To summarize, the implementation scheme can be de- 
scribed as follows: 

1) At initial instant q — 0, some initial parameter vector 
p(q) is chosen. An initial number of iterations qo = 
q(q) < N is also decided. 

2) During the updating interval [q , q = qor c ] the first qo 
elements of the control sequence U(p(q)) are applied. 

3) In parallel, the computation unit performs the follow- 
ing tasks during [q,q]'. 

a) Compute a model based prediction x{q) of the 
state at instant t\. 

b) Compute p(q) = S( q W(p+(t%), x (q)) where 
the initial guess is either equal to p{t$) 
(cold start) or equal to some warm start value 
that is associated to p{t$) by standard translation 
technique. 

4) At the updating instant q, decide the number of 
iterations q(q) to perform during the next updating 
interval [q,q + q(q)r c ]. The way this choice is 
done represents the contribution of the paper and is 
explained in the following sections. 

5) During the updating interval [q,q + q(q)T c ], apply 
the first q{q) elements of the sequence U{p{t\)) 
previously computed 

6) keep doing. . . 

C. Monitoring Control Updating As A Feedback Problem 

It comes out from the discussion above that an extended 
system can be defined in which the state is given by z := 



(p T ,x T ) T and for which the dynamics is given by: 



to define an extended state for instance). 



X (tt +l ) = x r (T*,x(tt), P (e k ),w) 



(10) 



p(t u k+1 ) = S^(p + (tt), X(T%,x(tt),p(t u k )) (ID 

m+i) 

or in a more condensed way using the extended state z: 



!(t u k+1 ) = F(z(t u k ),q(t u k ),w) 



(12) 



which can be viewed as a dynamical system in which z plays 
the role of state vector while q plays the role of control input. 
Moreover, the control objective that has to be achieved by the 
control input q is to steer the scalar output J(p,x) = J(z) 
to its minimum value as fast as possible. Note that whether 
this achieves the control objective depends on the adequacy 
of the original MPC formulation and lies out of the scope 
of the control input q. The latter aims simply at making the 
distributed-in-time optimization reaches asymptotically the 
behavior of the ideal MPC computation. 

In order to analyze the evolution of the cost function, the 
following notation is needed: 

j£ := J(p + (tj£), x The value of the cost function 
when the iterations start at the beginning of the updating 
interval J based on the predicted value x{t^ +1 ) of 



the future state at instant t 



tt + q(t u k )-T c 



Jk+i '■= J(p(tk+i)i ^(^jfc+i)) 1 The value of the cost function 
after q(t^) iterations based on the predicted value of x(t% +1 ) 
of the state. 

Jfc+i := J(p(t k+1 ),x(t k+1 )): The true value of the cost 
function at the new value of the state x(t k+1 ). 

Obviously, the convergence is tightly related to the 
ratio Jk+i/Jk which can be written in terms of the above 
quantities in the following form: 



Jk+l 
Jk 




(13) 



D k {q{tl)) 



where 

• The term E k (q(t%)) only depends on the efficiency of 
the iteration scheme that starts at (p + (£■£), x (^ +1 )) 
as it represents the local ratio between the value at 
the beginning of the q(t k ) iterations and the value at 
the end and this for a given estimation x{t^ +1 ) of the 
future state. 

• The term D k (q(t%)) represents the effects of uncer- 
tainties and/or the imperfection of the finite horizon 
parameterization. More precisely: 

- the term J^+i/Jk+i is linked to the difference 
between the predicted state x(t k +\) that is used 
in the iterations and the true state x(tk+i) that is 
effectively found at This difference may be 

induced by model mismatches (including unknown 
dynamics of the set-point when the latter is used 



- the term J k / J k represents the model based ratio 
between the new value of the cost function after 
prediction horizon shift by r% and using the shifted 
value of the control parameter p + {t^) and the value 
of the cost function before horizon shift. This ratio 
would be necessarily < 1 in an infinite horizon set- 
ting using classical p.w.c control parameterization. 



Based on (13) it comes clearly that the indicator K™ in 
defined by: 

Kr n := min K k (q) := E k (q)D k (q) (14) 

qe{l,...,N} 

is of great importance since the convergence of the scheme 
may be guaranteed if for all k, one can be sure that 



jqntn < j 



The computation of the feedback control q(t k ) in (12) can 
be rationally done if one disposes at instant t\ of estimated 
models for the maps E k {q) and D k (q) as functions of the 
number of iterations q. Indeed, in this case, the following 
feedback can be used: 

g*(*JJ):= aig min *(</):= (15) 

qe{l,...,N} 

q under K k (q) < 1 If K™ in < 1 



\log(K k (q))\ 
K k {q) 



otherwise 



More clearly, if the constraint K k {q) < 1 is feasible 
(K™ in < 1), then optimization focuses on the settling time 
which is proportional to q/\ \og(K k (q))\ but this minimiza- 
tion is done over those values of q that correspond to a 
contraction. Otherwise the contraction factor is enforced by 
minimizing K k (q). 

To summarize, if models for E k (-) and D k (-) were avail- 
able, equation ( [T5] ) completely defines the feedback law q 
for the extended system ( [T2| ). It may be argued however that 
assuming the availability of E k (q), even through identifica- 
tion (as it has been suggested and shown to be efficient on a 
rather involved nonlinear example in [2]) is a rather strong 
assumption. An alternative method is proposed in the next 
section that enhances a distributed-in-time solution of ( p3] ). 

III. A SUB-OPTIMAL CONTROL UPDATING RULE 
Recall first of all that at the end of the computation period 



[£fc?£fe+i]> one disposes by using (13) of the following local 
estimation of E k (q(t%)) and D k (q!j%)): 



Ek{q{t u k )) 



Jk+l 



D k {q{t u k )) 



Jk + \J k 

Jk+lJk 



(16) 



Moreover, provided that q{t k ) > 2, one can use the interme- 
diate results of the iterations to estimate the gradient of E k 



at q(t^). Indeed, one obviously has: 
AE k 



Aq 



(17) 



Regarding the map Dk( ), one may notice that by definition 
-Dfc(O) = 1. Therefore, the following simple model can be 
adopted for Dk(-): 



D k (q) = l + aP-q 



(18) 



where the coefficient aP is estimated, in accordance with 



<[T6|» by the expression: 



a 



D 



Jk+lJk 



1 



AD, 
Aq 



(19) 



which is nothing but an estimation of the gradient of D k at 
q(t k ). Therefore, using (17) and (19), the sensitivity of K k 
at q(t^) can be evaluated: 



Ai^/c , , „ AD k ^ AE k 



Aq ™ K " K " " " w Aq " Aq 

Moreover, when K k (q(t^)) < 1, the sensitivity of the settling 
time can also be obtained by: 

A(q/\log(K k (q))\) 



K k Aq 



Aq 



pog(#*)] 2 



(21) 



Having all the quantities above at hand, the updating rule for 
q can be given by Algorithm [T] 

Algorithm 1 Updating rule q(t% +1 ) = U(q(t%),t%) 
l: If (K k > 1) then 



Else 

r <- 

End If 



Aq 

A(q/\log(K k (q))\) 
Aq 



[see (17), (19) and (20)] 



[see (21)] 



2(*fc+i) ^ max|2,min{g moa .,g(^) - S • sign(T)}} 



Note that in Algorithm [T] T stands for the gradient of K k 
if the current estimated value of K k is greater to 1 (Step 
2), otherwise, Y is taken to be the gradient of the settling 
time (Step 4). In both cases, a step S £ N in the opposite 
direction is taken and resulting value is projected in [2, q m ax] 
where q m ax < N is the maximum number of iterations being 
allowed. Note that q > 2 is needed for the estimation of 



AE k /Aq to be possible according to (17). 

It is obvious that Algorithm [T] implements a quantified 
gradient method that hopefully reaches a vicinity of the 



optimal solution of (15). This is more likely to occur when 
the unknown efficiency map is monotonically decreasing 
function of q. This is generally not the case in fast gradient 
method unless a restarting strategy is adopted as shown in 
the following section. 



IV. MONOTIC VERSION OF THE FAST GRADIENT 
METHOD 

The fast gradient [5] version of the descent iteration 
invoked in ([5]) leads to the following algorithm: 

Algorithm 2 Fast gradient iterations p = S( q \p, x) 

1: Initialization, p^ 
2: for i = 1, q do 



P 



(0 



^- p, r ^- p 
1 



Pc{r--[VJ(r,x))}) 



4: r ^— p 
5: end for 



c(p 



p 



6: p ^ p 



in which L is an upper bound on the Lypschitz constant 
of the gradient V J of the cost function, namely: 



(20) V(x,pi,p 2 ) , \\WJ(p2,x) - VJ(pi,x)\\ < L\\p2 -Pi\ 



while c £ [0, 1[ is chosen according to the problem 
co nditioning [6] . For quad ratic problems, the choice c = 

= = where H is the problem Hessian 

V /x max (H) + J\ min (H) 

is considered to be optimal. Note that in the case where c = 
is used, the algorithm reduces to the classical pure gradient 
descent. 

The map Pc is the projection map into the set of ad- 
missible values. This map can be made easy to compute by 
working on the dual problem in which the constraints are 
limited to hypercube for which Pc is reduced to a vector 
of element- wise saturation constraints (See for instance [3], 
[4]). 

The convergence of the above scheme depends on the 
choice of c. Indeed, the pure gradient method (c = 0) shows a 
proved convergence rate in 1 ji while an optimal fast gradient 
method decreases the cost function as + 2) 2 . More 
precisely, denoting by J*(x) the optimal value of the cost 
function, one can write: 



J(S {q \p,x),x)- J*{x) < 



a 



£e[l,2] (22) 



Recall that while the pure gradient method leads to mono- 
tonically decreasing cost, the use of c > in the fast gradient 
version very frequently leads to non monotonically decreas- 
ing behavioi^] In [7], the authors showed very elegantly 
that this fact occurs systematically under certain quite mild 
conditions. Moreover, they proposed the following modified 
version of the fast algorithm: 



2 It is worth underlying that this is not incompatible with J22J since the 
later is only a bound on the potential oscillations. 
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Fig. 1. Evolution of the relative decrease given by 



\J(p(°,x)\ 

as a function of the number of iterations q for pure gradient and fast gradient 
without restart or with two different values of the restart counter threshold 
Smax used in Algorithm [3] 



Algorithm 3 Fast gradient iterations p — 
constant restarting strategy 

Initialization. p^> <— p, r <— p, s <— 
for i = 1, q do 

s <- s + 1 

^P C (r- y[VJ(r,x))]) 



S^ q \p,x) with 



c(p 



r ^— p K 

if (s = Smax) then r <— pW, s = End if 
end for 



p <— p 



(<z) 



Note that in this new version, a restarting feature is intro- 
duced each Smax iterations of the original fast algorithm in 
order to avoid too large moment due to increasing difference 
between r and . 

Figure [T] shows a typical evolutions of the ratio: 

J(p( q \x) - J(p(°\x) 
\J(pW,x)\ 

for different configuration of Algorithm [3] namely: The pure 
gradient algorithm (c = 0), the fast gradient with no restart 
(Smax = the fast gradient with s ma x = 5 and 8 

respectively. These curves have been at a randomly chosen 
instant during the scenario depicted in the example studied 
in section |V| Other random instant systematically gave the 
same kind of behavior. 

The main message that can be derived from this obser- 
vation is that provided that the restart technique is applied 
in the case of fast gradient (c ^ 0), the efficiency map is 
monotonic and a gradient descent can be safely used in the 
updating rule for the control updating period following the 
lines used in Algorithm [T] 

V. ILLUSTRATIVE EXAMPLE 

Let us consider the constrained system consisting in a 
triple integrator: 




20 40 60 8 

Evolution of q 





Fig. 2. Closed-loop evolution under the adaptive control updating period 
with initial q = 2, S = 10, qmax = 100. Prediction horizon TV = 200. 



and assume that fast gradient MPC is used to force this 



system to track some reference signal y re f 
the cost function: 



ref 



j 



N 

k=i 



\\y(k)-y™f(k)\\l + \\u(k)\\ 



using 



(24) 



0.02. 



xi = x 2 ; x 2 = x 3 ; x 3 = u 



M < 1 



(23) 



with Q = 100 and R = 1 and a sampling period r 
The fast gradient is used with restart parameter s ma x = 8 
The number of iterations is limited to q ma x = 100 and the 
step size S = 10 is used in the updating rule defined in step 
6 of Algorithm [T] 

Figures [2] and [3] show the behavior of the closed-loop 
using the proposed adaptive algorithm for two different initial 
values q = 2 and q = 100 respectively and using a prediction 
horizon length of N = 200 sampling periods. The results 
clearly show that the updating scheme converges towards 
the same pattern regardless of the initial choice of q. The 
performance of the closed-loop under the proposed adaptive 
strategy is to be compared to those depicted on Figures [4] 
and [5] where constant q = 2 and q = 100 are respectively 
applied to assess the efficiency of the proposed strategy to 
improve the convergence towards the ideal case. 

Observing the behavior of the updating parameter q on 
Figures [2] and [3] suggests that a constant q — 20 would be 
appropriate. This is checked on Figure [6] where indeed a 
constant q = 20 seems to give nice results. Nevertheless, the 
same constant value q = 20 is no more appropriate when 
a prediction horizon N = 100 is used as shown on Figure 
[7] Again, firing the updating mechanism and starting from 
q = 20 enables the new optimal updating parameter to be 
recovered leading to quasi-optimal performance level (Figure 

0- 

VI. CONCLUSION 

In this work a new updating mechanism for the number 
of iterations to be performed in fast gradient based NMPC 
implementation is proposed and assessed using a,n illustra- 
tive example. The method shows nice ability to automati- 
cally recover the optimal performance despite bad a priori 
knowledge on the optimal number of iterations. The updating 
mechanism uses cheap computation based on the behavior of 
the algorithm and the cost function in closed-loop. 
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Fig. 3. Closed-loop evolution under the adaptive control updating period 
with initial q = 100, 5 = 10, q m ax = 100. Prediction horizon N = 200 
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Fig. 7. Closed-loop evolution without the updating mechanism and using 
constant q = 20. Prediction horizon N = 100 
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Fig. 4. Closed-loop evolution without the updating mechanism and using 
constant q = 2. Prediction horizon TV = 200 



Output Evolution 



Control Evolution 




20 40 60 8 
Evolution of q 




Fig. 5. Closed-loop evolution without the updating mechanism and using 
constant q = 100. Prediction horizon N = 200 
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Fig. 8. Closed-loop evolution under the adaptive control updating period 
with initial q = 20, 5 = 10, qmax = 100. The restarting threshold s ma x = 
8 is used in the fast gradient descent. Prediction horizon N = 100 
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Fig. 6. Closed-loop evolution without the updating mechanism and using 
constant q = 20. Prediction horizon TV = 200 



